Abstract

We systematically investigate the farthest distance function, farthest points, Klee sets, and 
Chebyshev centers, with respect to Bregman distances induced by Legendre functions. These 
objects are of considerable interest in Information Geometry and Machine Learning; when the 
Legendre function is specialized to the energy, one obtains classical notions from Approxima- 
tion Theory and Convex Analysis. 

The contribution of this paper is twofold. First, we provide an affirmative answer to a 
recently-posed question on whether or not every Klee set with respect to the right Bregman 
distance is a singleton. Second, we prove uniqueness of the Chebyshev center and we present 
a characterization that relates to previous works by Garkavi, by Klee, and by Nielsen and Nock.

1 Introduction

Throughout this paper,

(1) $\mathbb{R}^J$ is the standard Euclidean space with inner product $\langle\cdot,\cdot\rangle$ and induced norm $\|\cdot\|$.



Suppose that S is a nonempty subset of $\mathbb{R}^J$ such that for every point in $\mathbb{R}^J$, there exists a unique
farthest point in S, where "farthest" is understood in the standard Euclidean distance sense. Then
S is said to be a Klee set, and it is known that S must be a singleton; see, e.g., HJ [I6l [T7H19H201 for 
further information. (The situation in Hilbert space remains unsettled to this day.) 

In [7], Klee sets were revisited from a new perspective by using measures of discrepancy that are
fairly different from distances induced by norms. To describe and follow up on this viewpoint, we
assume throughout that

(2) $f\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$ is a convex function of Legendre type.



Recall that for a convex function $g\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$, the (essential) domain is
$\operatorname{dom} g = \{x \in \mathbb{R}^J \mid g(x) \in \mathbb{R}\}$, and $x^* \in \mathbb{R}^J$ is a subgradient of $g$ at a point
$x \in \operatorname{dom} g$, written $x^* \in \partial g(x)$, if $(\forall h \in \mathbb{R}^J)$ $g(x) + \langle h, x^*\rangle \le g(x+h)$; this induces the
corresponding set-valued subdifferential operator $\partial g\colon \mathbb{R}^J \rightrightarrows \mathbb{R}^J$. (For basic terminology and
results from Convex Analysis not stated explicitly in this paper, we refer the reader to
[8, 23, 25, 27].) Then $g$ is said to be essentially smooth if $g$ is differentiable on
$\operatorname{int}\operatorname{dom} g$ (the interior of its domain), and $\|\nabla g(x)\| \to +\infty$ whenever $x$ approaches a point in
the boundary $\operatorname{bdry}\operatorname{dom} g$; $g$ is essentially strictly convex if $g$ is strictly convex on
every convex subset of $\operatorname{dom}\partial g = \{x \in \mathbb{R}^J \mid \partial g(x) \neq \varnothing\}$; and $g$ is a convex function of
Legendre type (often simply called a Legendre function) if $g$ is both essentially smooth and essentially strictly
convex. See Il4l l9l HOl l23ll for further information on Legendre functions. It will be convenient to 
set 



(3) $U := \operatorname{int}\operatorname{dom} f$.



Many examples of Legendre functions exist; however, in this paper, we focus mainly on the 
following. 

Example 1.1 (Legendre functions) The following are Legendre functions, each evaluated at a
point $x \in \mathbb{R}^J$.

(i) (halved) energy: $f(x) = \tfrac12\|x\|^2 = \tfrac12\sum_j x_j^2$.

(ii) $f(x) = \tfrac12\langle x, Ax\rangle$, where $A \in \mathbb{R}^{J\times J}$ is symmetric and positive definite.

(iii) negative entropy: $f(x) = \begin{cases}\sum_j \big(x_j\ln(x_j) - x_j\big), & \text{if } x \in \mathbb{R}^J_{+};\\ +\infty, & \text{otherwise.}\end{cases}$

(iv) negative logarithm: $f(x) = \begin{cases}-\sum_j \ln(x_j), & \text{if } x \in \mathbb{R}^J_{++};\\ +\infty, & \text{otherwise.}\end{cases}$

Note that $U = \mathbb{R}^J$ in (i) and (ii), whereas $U = \mathbb{R}^J_{++}$ in (iii) and (iv).



Legendre functions are of considerable interest to us because they give rise to a very nice 
measure of discrepancy between points, nowadays termed the "Bregman distance"; see, e.g.. 

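Definition 1.2 (Bregman distance) The Bregman distance with respect to $f$, written $D_f$ or simply $D$,
is the function

(4) $D\colon \mathbb{R}^J \times \mathbb{R}^J \to [0,+\infty]\colon (x,y) \mapsto \begin{cases} f(x) - f(y) - \langle \nabla f(y), x-y\rangle, & \text{if } y \in U;\\ +\infty, & \text{otherwise.}\end{cases}$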



Although well established, the term "Bregman distance" is a misnomer because a Bregman 
distance is in general neither symmetric nor does it satisfy the triangle inequality. However, the 
Bregman distance is able to distinguish different points in the sense (see [2, Theorem 3.7.(iv)]) that

(5) $(\forall x \in \mathbb{R}^J)(\forall y \in U)\quad D(x,y) = 0 \;\Leftrightarrow\; x = y$.

Example 1.3 (Bregman distances) The Bregman distances corresponding to the Legendre functions
of Example 1.1 between two points $x$ and $y$ in $\mathbb{R}^J$ are as follows.

(i) (halved) Euclidean distance squared: $D(x,y) = \tfrac12\|x-y\|^2$.

(ii) (halved) Mahalanobis distance squared: $D(x,y) = \tfrac12\langle x-y, A(x-y)\rangle$.

(iii) Kullback-Leibler divergence: $D(x,y) = \begin{cases}\sum_j \big(x_j\ln(x_j/y_j) - x_j + y_j\big), & \text{if } x \in \mathbb{R}^J_{+} \text{ and } y \in \mathbb{R}^J_{++};\\ +\infty, & \text{otherwise.}\end{cases}$

(iv) Itakura-Saito distance: $D(x,y) = \begin{cases}\sum_j \big(\ln(y_j/x_j) + x_j/y_j - 1\big), & \text{if } x \in \mathbb{R}^J_{++} \text{ and } y \in \mathbb{R}^J_{++};\\ +\infty, & \text{otherwise.}\end{cases}$
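The distances of Example 1.3 are easy to evaluate directly. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, points are plain Python sequences, and the convention $0\ln 0 = 0$ is assumed for the Kullback-Leibler case. The Mahalanobis case (ii) is omitted to keep the sketch short.

```python
import math

def halved_sq_euclidean(x, y):
    """Example 1.3(i): D(x, y) = 0.5 * ||x - y||^2."""
    return 0.5 * sum((xj - yj) ** 2 for xj, yj in zip(x, y))

def kullback_leibler(x, y):
    """Example 1.3(iii), assuming the convention 0 * ln(0) = 0."""
    if any(xj < 0 for xj in x) or any(yj <= 0 for yj in y):
        return math.inf
    total = 0.0
    for xj, yj in zip(x, y):
        if xj > 0:
            total += xj * math.log(xj / yj)
        total += yj - xj
    return total

def itakura_saito(x, y):
    """Example 1.3(iv): finite only when x and y are strictly positive."""
    if any(xj <= 0 for xj in x) or any(yj <= 0 for yj in y):
        return math.inf
    return sum(math.log(yj / xj) + xj / yj - 1.0 for xj, yj in zip(x, y))
```

Evaluating `kullback_leibler((1.0, 1.0), (2.0, 2.0))` and `kullback_leibler((2.0, 2.0), (1.0, 1.0))` gives two different values, which illustrates the lack of symmetry that motivates the distinction between right and left notions later on.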
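From now on, we assume that C is a subset of $\mathbb{R}^J$ such that

(6) $\varnothing \neq C \subseteq U$.

Definition 1.4 (right Bregman farthest-distance function and farthest-point map) The right Bregman
farthest-distance function is

(7) $F_C\colon \mathbb{R}^J \to [0,+\infty]\colon x \mapsto \sup_{c \in C} D(x,c)$,

and the corresponding right Bregman farthest-point map is

(8) $Q_C\colon \mathbb{R}^J \rightrightarrows \mathbb{R}^J\colon x \mapsto \begin{cases}\{c \in C \mid D(x,c) = F_C(x)\}, & \text{if } x \in \operatorname{dom} F_C;\\ \varnothing, & \text{otherwise.}\end{cases}$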



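Observe that

(9) $F_C$ is convex and lower semicontinuous,

that

(10) $\operatorname{dom} Q_C \subseteq \operatorname{dom} F_C \subseteq \operatorname{dom} f$,

and that

(11) if C is compact, then $\operatorname{dom} Q_C = \operatorname{dom} F_C = \operatorname{dom} f$.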
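For a finite set C, the farthest-distance function and the farthest-point map can be evaluated by direct enumeration. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the Kullback-Leibler divergence is used with strictly positive points, and ties are detected up to a small tolerance `tol` that is not part of the paper.

```python
import math

def kl(x, y):
    """Kullback-Leibler divergence for strictly positive x and y."""
    return sum(xj * math.log(xj / yj) - xj + yj for xj, yj in zip(x, y))

def farthest_distance(D, C, x):
    """F_C(x): the largest value of D(x, c) over a finite list C."""
    return max(D(x, c) for c in C)

def farthest_points(D, C, x, tol=1e-12):
    """Q_C(x): all c in C attaining F_C(x), up to the tolerance tol."""
    F = farthest_distance(D, C, x)
    if not math.isfinite(F):
        return []
    return [c for c in C if F - D(x, c) <= tol]

C = [(1.0, 4.0), (4.0, 1.0)]
```

With this two-point set, `farthest_points(kl, C, (3.0, 1.0))` returns only `(1.0, 4.0)`, while a point on the diagonal such as `(2.0, 2.0)` has both elements of C as farthest points.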

We are now ready to continue the discussion on Klee sets started earlier by introducing a notion 
central to this paper. 

Definition 1.5 (D-Klee set) The set C is said to be D-Klee if, for every $x \in U$, $Q_C x$ is a singleton.

The asymmetry of D also gives rise to the left Bregman farthest-distance function and the associated
farthest-point map and Klee sets. These objects were analyzed in [7] and are not treated here.
In fact, under additional assumptions, right and left notions may be related to each other via
duality. However, the duality approach was not powerful enough to settle the question, raised
in [7, Remark 7.3], whether or not every D-Klee set is a singleton when $f$ does not have full domain,
as is the case when D is, e.g., the Kullback-Leibler divergence or the Itakura-Saito distance. The
first contribution of this paper is to settle this question entirely, for manifestations of $f$ that are even
more general than those considered in [7]. In fact, in Theorem 3.2 we prove that the answer is
affirmative in the present setting.

Another related line of work concerns Chebyshev centers. Again, let us start by reviewing the
classical situation in Euclidean spaces. Let S be a nonempty compact subset of $\mathbb{R}^J$. The Chebyshev
center is the center of the smallest closed ball in $\mathbb{R}^J$ that contains the set S entirely.
The Chebyshev center exists and is unique, and a classical result due to Garkavi and Klee
(see Corollary 4.5 below) provides a geometric characterization of it. Unlike Klee sets, Chebyshev
centers have already been investigated in the context of Bregman distances; see, e.g., the work
by Nielsen and Nock [21, 22] (see Corollary 4.6 below) and the references therein. However, it
is assumed there that S is finite. The second contribution of this paper is to extend the classical work in
Euclidean space by Garkavi and Klee and the recent work by Nielsen and Nock on Chebyshev centers of
finite sets with respect to Bregman distances. In Theorem 4.4 we prove existence and uniqueness of
Chebyshev centers of compact sets with respect to the Bregman distance, and we present a geometric
characterization of them.

The remainder of the paper is organized as follows. In Section 2 we collect and present several
results that will make the proofs of the main results more structured and easier to follow. The
main result in Section 3 is Theorem 3.2, which states that every compact D-Klee set is indeed a
singleton. In Section 4 we guarantee existence and uniqueness of the D-Chebyshev center, and
we characterize it geometrically. In Section 5 we illustrate our results with an example for three
Bregman distances.

2 Auxiliary Results 

In this section, we collect several results that will make the proofs of the main results easier to 
follow. We start with two identities that are straightforward consequences of (4).

Lemma 2.1 (See HH Lemma 3.1].) Let $x$ be in $\mathbb{R}^J$, and let $y$ and $z$ be in $U$. Then

(12) $D(x,z) - D(y,z) = D(x,y) + \langle x - y, \nabla f(y) - \nabla f(z)\rangle$.

Lemma 2.2 (See IB Remark 2.5].) Let $x_1$ and $x_2$ be in $\operatorname{dom} f$, and let $y_1$ and $y_2$ be in $U$. Then

(13) $\langle x_1 - x_2, \nabla f(y_1) - \nabla f(y_2)\rangle = D(x_2,y_1) + D(x_1,y_2) - D(x_1,y_1) - D(x_2,y_2)$.
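Both identities can be checked numerically for a concrete Legendre function. The sketch below is an illustration only: it assumes the negative entropy (so that D is the Kullback-Leibler divergence and the gradient is the componentwise logarithm), draws a few random strictly positive points with a fixed seed, and uses hypothetical helper names.

```python
import math
import random

def grad_f(y):
    """Gradient of the negative entropy: componentwise logarithm."""
    return [math.log(yj) for yj in y]

def D(x, y):
    """Kullback-Leibler divergence for strictly positive x and y."""
    return sum(xj * math.log(xj / yj) - xj + yj for xj, yj in zip(x, y))

def inner(u, v):
    return sum(uj * vj for uj, vj in zip(u, v))

def sub(u, v):
    return [uj - vj for uj, vj in zip(u, v)]

def three_point_gap(x, y, z):
    """Left side minus right side of identity (12)."""
    lhs = D(x, z) - D(y, z)
    rhs = D(x, y) + inner(sub(x, y), sub(grad_f(y), grad_f(z)))
    return lhs - rhs

def four_point_gap(x1, x2, y1, y2):
    """Left side minus right side of identity (13)."""
    lhs = inner(sub(x1, x2), sub(grad_f(y1), grad_f(y2)))
    rhs = D(x2, y1) + D(x1, y2) - D(x1, y1) - D(x2, y2)
    return lhs - rhs

rng = random.Random(0)  # fixed seed so that the check is reproducible
pts = [[rng.uniform(0.5, 3.0) for _ in range(3)] for _ in range(4)]
```

Both `three_point_gap(pts[0], pts[1], pts[2])` and `four_point_gap(*pts)` should vanish up to rounding error.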
Lemma 2.3 The Bregman distance D is continuous on $U \times U$.

Proof. This follows from [23, Theorem 10.1 and Corollary 25.5.1]. ■

Fact 2.4 (Rockafellar) (See [23, Theorem 26.5].) The gradient operator $\nabla f$ is a continuous bijection
between $U$ and $\operatorname{int}\operatorname{dom} f^*$, with continuous inverse $(\nabla f)^{-1} = \nabla f^*$. Furthermore, $f^*$ is also a convex
function of Legendre type.
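For the negative entropy and the negative logarithm of Example 1.1, the two gradient maps in Fact 2.4 are available in closed form, which makes the bijection easy to test. The sketch below assumes these standard closed forms (componentwise $\ln$ and $\exp$ for the negative entropy, and $u \mapsto -1/u$ in both directions for the negative logarithm); the function names are hypothetical.

```python
import math

def grad_neg_entropy(x):
    return [math.log(xj) for xj in x]      # defined on U, the positive orthant

def grad_neg_entropy_conj(u):
    return [math.exp(uj) for uj in u]      # here int dom f* is all of R^J

def grad_neg_log(x):
    return [-1.0 / xj for xj in x]         # defined on U, the positive orthant

def grad_neg_log_conj(u):
    return [-1.0 / uj for uj in u]         # here int dom f* is the negative orthant

x = [0.5, 2.0, 7.0]
roundtrip_entropy = grad_neg_entropy_conj(grad_neg_entropy(x))
roundtrip_log = grad_neg_log_conj(grad_neg_log(x))
```

In both cases the round trip returns the starting point up to rounding, as the identity $(\nabla f)^{-1} = \nabla f^*$ predicts.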
Recall that a function $g\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$ is coercive if all its lower level sets are bounded;
equivalently, if $\lim_{\|x\|\to+\infty} g(x) = +\infty$. The following is thus clear.

(14) If $g\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$ is coercive and lower semicontinuous, then $\operatorname{argmin} g \neq \varnothing$.

Here $\operatorname{argmin} g$ denotes the set of minimizers of $g$.

Fact 2.5 (See [23, Corollary 14.2.2].) Let $g\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$ be convex, lower semicontinuous, and
proper, and let $x^* \in \mathbb{R}^J$. Then $g(\cdot) - \langle\cdot,x^*\rangle$ is coercive if and only if $x^* \in \operatorname{int}\operatorname{dom} g^*$.

Fact 2.6 (Ioffe-Tikhomirov) (See [27, Theorem 2.4.18].) Let $A$ be a compact Hausdorff space, let
$g_a\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$ be convex for every $a \in A$, and set $g := \sup_{a\in A} g_a$. Assume that
$(\forall x \in \mathbb{R}^J)$ $A \to \left]-\infty,+\infty\right]\colon a \mapsto g_a(x)$ is upper semicontinuous and that $x_0 \in \operatorname{dom} g$ is a point
such that $(\forall a \in A)$ $g_a$ is continuous at $x_0$. Then

(15) $\partial g(x_0) = \operatorname{conv} \bigcup_{\{a \in A \,\mid\, g(x_0) = g_a(x_0)\}} \partial g_a(x_0)$.

Lemma 2.7 (See IH Proposition 3.16].) Suppose that C is closed and convex, and let $y \in U \setminus C$. Then
there exists a unique point $\bar c \in C$ such that

(16) $(\forall c \in C)\quad \langle c - \bar c, \nabla f(y) - \nabla f(\bar c)\rangle \le 0$.

Lemma 2.8 Suppose that C is compact, and let $x \in U \setminus \big((\nabla f^*)(\operatorname{conv}(\nabla f(C)))\big)$. Then there exists
$y \in (\nabla f^*)(\operatorname{conv}(\nabla f(C))) \subseteq U$ such that

(17) $(\forall c \in C)\quad D(x,c) \ge D(x,y) + D(y,c)$.

Proof. Set $S := \nabla f(C)$ and $V := \operatorname{int}\operatorname{dom} f^* = \nabla f(U)$. Since C is compact and $\nabla f$ is continuous
(Fact 2.4), the set $S$ is compact. Using [23, Theorem 17.2], we deduce that $\overline{\operatorname{conv}}\, S = \operatorname{conv} S$ is a
nonempty proper compact subset of $V$. Using Fact 2.4 again, we see that

(18) $(\nabla f^*)(\operatorname{conv} S)$ is a proper compact subset of $U$

and that $x^* := \nabla f(x) \in V \setminus (\operatorname{conv} S)$. Applying Lemma 2.7 (to $f^*$, $\operatorname{conv} S$, and $x^*$), we obtain a
point $y^* \in \operatorname{conv} S$ such that

(19) $(\forall v \in \operatorname{conv} S)\quad \langle v - y^*, \nabla f^*(x^*) - \nabla f^*(y^*)\rangle \le 0$.

Now set $y := \nabla f^*(y^*)$. Then (19) yields

(20) $(\forall c \in C)\quad \langle \nabla f(y) - \nabla f(c), x - y\rangle \ge 0$.

Combining this with Lemma 2.1, we estimate

(21) $(\forall c \in C)\quad D(x,c) - D(y,c) = D(x,y) + \langle \nabla f(y) - \nabla f(c), x - y\rangle \ge D(x,y)$,

which completes the proof. ■



Let $X$ and $Y$ be nonempty subsets of $\mathbb{R}^J$ and let $A\colon X \rightrightarrows Y$ be a set-valued operator, i.e., $(\forall x \in X)$
$Ax \subseteq Y$. Denote the graph of $A$ by $\operatorname{gr} A := \{(x,y) \in X \times Y \mid y \in Ax\}$. We say that $A$ is monotone
from $X$ to $Y$ if

(22) $(\forall (x,x^*) \in \operatorname{gr} A)(\forall (y,y^*) \in \operatorname{gr} A)\quad \langle x - y, x^* - y^*\rangle \ge 0$.

If $A$ is monotone from $X$ to $Y$ and every proper set-valued extension from $X$ to $Y$ is not monotone,
then $A$ is maximal monotone from $X$ to $Y$. If $X = Y = \mathbb{R}^J$, we will simply speak of monotone and
maximal monotone operators; this is the usual and well known setting.

We now present a variant of [24, Example 12.7], which is a sufficient condition for maximal
monotonicity.

Proposition 2.9 Let $O$ be a nonempty open subset of $\mathbb{R}^J$, let $Y$ be a subset of $\mathbb{R}^J$, and let $A\colon O \to Y$ be
monotone and continuous. Then $A$ is maximal monotone from $O$ to $Y$.

Proof. Suppose that $(x,y) \in O \times Y$ satisfies

(23) $(\forall z \in O)\quad \langle x - z, y - Az\rangle \ge 0$,

and denote the closed unit ball in $\mathbb{R}^J$ by $B$. Then for all sufficiently small $\varepsilon > 0$, we have
$x + \varepsilon B \subseteq O$ and hence $(\forall b \in B)$ $\langle x - (x + \varepsilon b), y - A(x + \varepsilon b)\rangle \ge 0$, and so $\langle b, y - A(x + \varepsilon b)\rangle \le 0$.
Letting $\varepsilon \to 0^+$ for fixed but arbitrary $b \in B$, and using continuity of $A$ at $x$, we deduce that
$\langle b, y - Ax\rangle \le 0$. Supremizing this last inequality over $b \in B$, we obtain $\|y - Ax\| \le 0$. Hence
$(x,y) = (x,Ax) \in \operatorname{gr} A$, as required. ■

Our first result reveals a monotonicity property of Qc- (See also Il26ll and (Zl Proposition 7.1], 
and mi, where we discuss Chebyshev sets instead of Klee sets.) 

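Proposition 2.10 The set-valued operator $-\nabla f \circ Q_C\colon \mathbb{R}^J \rightrightarrows \mathbb{R}^J$ is monotone.

Proof. Assume that $(x,x^*)$ and $(y,y^*)$ lie in $\operatorname{gr} Q_C$. It follows from (8) and Lemma 2.2 (applied to
$x_1 = x$, $x_2 = y$, $y_1 = y^*$, and $y_2 = x^*$) that

(24) $0 \le \big(D(x,x^*) - D(x,y^*)\big) + \big(D(y,y^*) - D(y,x^*)\big)$
$\quad = \langle x - y, \nabla f(y^*) - \nabla f(x^*)\rangle$
$\quad = \langle x - y, (-\nabla f)(x^*) - (-\nabla f)(y^*)\rangle$,

as required. ■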

Proposition 2.11 Suppose that C is closed, and that $((x_n,y_n))_{n\in\mathbb{N}}$ is a sequence in $(\operatorname{gr} Q_C) \cap (U \times \mathbb{R}^J)$
such that $(x_n,y_n) \to (x,y) \in U \times \mathbb{R}^J$. Then $(x,y) \in \operatorname{gr} Q_C$.

Proof. Since $\operatorname{ran} Q_C \subseteq C$, the sequence $(y_n)_{n\in\mathbb{N}}$ lies in C and it satisfies $(\forall n \in \mathbb{N})$ $D(x_n,y_n) = F_C(x_n)$.
Because C is closed, $y \in C \subseteq U$. By Lemma 2.3, D is continuous on $U \times U$. In view of (9),
we deduce altogether

(25) $F_C(x) \le \liminf_n F_C(x_n) = \lim_n D(x_n,y_n) = D(x,y) \le F_C(x)$.

Therefore, $F_C(x) = D(x,y)$, i.e., $y \in Q_C(x)$. ■

Proposition 2.12 Suppose that $\overline{C} \subseteq U$. Then $\operatorname{gr} Q_C \subseteq \operatorname{gr} Q_{\overline{C}}$.

Proof. Take $(x,y) \in \operatorname{gr} Q_C$. Then $y \in C \subseteq \overline{C}$ and $(\forall c \in C)$ $D(x,y) \ge D(x,c)$. Since $\overline{C} \subseteq U$
and $D(x,\cdot)$ is continuous on $U$ (Lemma 2.3), it follows that $(\forall c \in \overline{C})$ $D(x,y) \ge D(x,c)$. Thus,
$y \in Q_{\overline{C}}(x)$. ■

Remark 2.13 Assume that $c \in \overline{C} \cap \operatorname{bdry} U$. In view of (6), there exists a sequence $(c_n)_{n\in\mathbb{N}}$ in
$C \subseteq U$ such that $c_n \to c$; hence, by [2, Theorem 3.8.(i)], $(\forall x \in U)$ $D(x,c_n) \to +\infty$. Therefore, the
assumption that $\overline{C}$ be a subset of $U$ is very natural in Proposition 2.12 and elsewhere in this paper.

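Proposition 2.14 Suppose that $(\forall x \in \operatorname{dom} f)$ $D(x,\cdot)$ is convex on $U$. Then $\operatorname{gr} Q_C \subseteq \operatorname{gr} Q_{\operatorname{conv} C}$.

Proof. Take $(x,y) \in \operatorname{gr} Q_C$. Then $x \in \operatorname{dom} f$, $y \in C \subseteq \operatorname{conv} C$, and $(\forall c \in C)$ $D(x,c) \le D(x,y)$.
Now let $z \in \operatorname{conv} C$, say $z = \sum_i \lambda_i c_i$, where each $\lambda_i \in [0,1]$, each $c_i \in C$, and $\sum_i \lambda_i = 1$. Then
$D(x,z) \le \sum_i \lambda_i D(x,c_i) \le \sum_i \lambda_i D(x,y) = D(x,y)$ and therefore $y \in Q_{\operatorname{conv} C}(x)$. ■

Proposition 2.15 Suppose that $(\forall x \in \operatorname{dom} f)$ $D(x,\cdot)$ is strictly convex on $U$. Then $\operatorname{gr} Q_C = \operatorname{gr} Q_{\operatorname{conv} C}$.

Proof. In view of Proposition 2.14, we only need to show that $\operatorname{gr} Q_{\operatorname{conv} C} \subseteq \operatorname{gr} Q_C$. To this end,
let $(x,y) \in \operatorname{gr} Q_{\operatorname{conv} C}$. Then $x \in \operatorname{dom} f$, $y \in \operatorname{conv} C$, and $(\forall s \in \operatorname{conv} C)$ $D(x,s) \le D(x,y)$. In
particular, $(\forall c \in C)$ $D(x,c) \le D(x,y)$. The proof is complete as soon as we have verified that
$y \in C$. Assume to the contrary that $y \notin C$. Then $y = \sum_{i=1}^n \lambda_i c_i$, where $n \ge 2$, each $\lambda_i > 0$, each
$c_i \in C$, the $c_i$ are pairwise distinct, and $\sum_{i=1}^n \lambda_i = 1$. But then
$D(x,y) < \sum_{i=1}^n \lambda_i D(x,c_i) \le \sum_{i=1}^n \lambda_i D(x,y) = D(x,y)$, which is absurd. ■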

The next result shows that when D is separately convex (see O for a systematic discussion of 
separate and joint convexity of D), then the farthest-point distance is "blind" to the convex hull. 

Proposition 2.16 Suppose that $(\forall x \in \operatorname{dom} f)$ $D(x,\cdot)$ is convex. Then $F_{\operatorname{conv} C} = F_C$.

Proof. This follows from [23, Theorem 32.2]. ■

3 Klee Sets are Singletons

The following result will be critical in the proof of our first main result (Theorem 3.2).

Theorem 3.1 Suppose that C is compact. Then $\operatorname{argmin} F_C$ is a nonempty subset of $U$.
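Proof. By (11), $\operatorname{dom} F_C = \operatorname{dom} f$. Since $C \subseteq U$, it follows from Fact 2.4 that $\nabla f(C) \subseteq \nabla f(U) =
\operatorname{int}\operatorname{dom} f^*$. In view of Fact 2.5, we deduce that

(26) $(\forall c \in C)\quad f(\cdot) - \langle\cdot,\nabla f(c)\rangle$ is coercive.

Since $(\forall c \in C)$ $D(\cdot,c) = \big(f(\cdot) - \langle\cdot,\nabla f(c)\rangle\big) + \big(\langle c,\nabla f(c)\rangle - f(c)\big)$, it follows that

(27) $(\forall c \in C)\quad D(\cdot,c)$ is coercive.

In turn, this implies that

(28) $F_C(\cdot) = \sup_{c\in C} D(\cdot,c)$ is coercive, convex, lower semicontinuous, and proper.

In view of (28) and (14), $\operatorname{argmin} F_C \neq \varnothing$. Let

(29) $x_0 \in \operatorname{argmin} F_C$.

It suffices to show that

(30) $x_0 \in U$.

Assume to the contrary that $x_0 \notin U$. In view of (10) and (29), $x_0 \in (\operatorname{dom} f \setminus U) \subseteq \operatorname{bdry}\operatorname{dom} f$.
Now fix an arbitrary point $x_1 \in U$ and set

(31) $(\forall \varepsilon \in \left]0,1\right[)\quad x_\varepsilon := (1-\varepsilon)x_0 + \varepsilon x_1$.

By [23, Theorem 6.1], $(\forall \varepsilon \in \left]0,1\right])$ $x_\varepsilon \in U$. Set $S := \nabla f(C)$. As already observed in the proof of
Lemma 2.8, $\overline{\operatorname{conv}}\, S = \operatorname{conv} S$ is a proper compact subset of $\operatorname{int}\operatorname{dom} f^*$. Thus, there exists $\delta \in \left]0,1\right]$
such that $(\forall \varepsilon \in \left]0,\delta\right])$ $x_\varepsilon \in U \setminus (\nabla f^*)(\operatorname{conv} S)$. Lemma 2.8 now yields

(32) $(\forall \varepsilon \in \left]0,\delta\right])(\exists\, y_\varepsilon \in (\nabla f^*)(\operatorname{conv} S))(\forall c \in C)\quad D(x_\varepsilon,c) \ge D(x_\varepsilon,y_\varepsilon) + D(y_\varepsilon,c)$.

On the one hand, while $f$ is not necessarily continuous at $x_0$, it is at least continuous along the line
segment $[x_0,x_1]$ (see [23, Theorem 7.5]); consequently,

(33) $\lim_{\varepsilon\to 0^+} f(x_\varepsilon) = f(x_0)$.

On the other hand, the net $(y_\varepsilon)_{\varepsilon\in\left]0,\delta\right]}$ lies in $\nabla f^*(\operatorname{conv} S)$, which is a compact set. After passing to
a subnet and relabeling if necessary, we assume that there exists a point $y_0 \in \mathbb{R}^J$ such that

(34) $\lim_{\varepsilon\to 0^+} y_\varepsilon = y_0 \in \nabla f^*(\operatorname{conv} S) \subseteq U$.

Combining (33) and (34), invoking Lemma 2.3, and taking the limit in (32), we obtain altogether
that

(35) $(\forall c \in C)\quad D(x_0,c) \ge D(x_0,y_0) + D(y_0,c)$.

Since $x_0 \in \operatorname{bdry}\operatorname{dom} f$ and $y_0 \in \operatorname{int}\operatorname{dom} f = U$, (5) results in $D(x_0,y_0) > 0$. Supremizing (35)
over $c \in C$, we deduce that

(36) $F_C(x_0) \ge D(x_0,y_0) + F_C(y_0) > F_C(y_0)$,

which contradicts (29). Therefore, we have verified (30), and the proof is complete. ■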

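Theorem 3.2 (every D-Klee set is a singleton) Suppose that C is compact and that C is D-Klee. Then
C is a singleton.

Proof. Recall that

(37) $F_C(\cdot) = \sup_{c\in C} D(\cdot,c) = \sup_{c\in C}\Big(\big(f(\cdot) - \langle\cdot,\nabla f(c)\rangle\big) + \big(\langle c,\nabla f(c)\rangle - f(c)\big)\Big)$.

Because C is D-Klee, if $x \in U$, then $Q_C x$ is the unique point in C such that $F_C(x) = D(x,Q_C x)$
and $(\forall c \in C \setminus \{Q_C x\})$ $F_C(x) > D(x,c)$. In view of Theorem 3.1, we take $x_0 \in \operatorname{argmin} F_C \subseteq U$.
Using Fact 2.6, we obtain

(38) $0 \in \partial F_C(x_0) = \nabla f(x_0) - \nabla f(Q_C x_0)$.

Hence $\nabla f(x_0) = \nabla f(Q_C(x_0))$ and thus $x_0 = Q_C(x_0)$. Consequently, $F_C(x_0) = D(x_0,x_0) = 0$,
so that $(\forall c \in C)$ $D(x_0,c) = 0$, and (5) gives $c = x_0$. Therefore, $C = \{x_0\}$. ■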

Corollary 3.3 (Klee) Suppose that C is a compact Klee set with respect to the Euclidean distance. Then C
is a singleton.

Proof. (See also HH.) This follows from Theorem 3.2 when $f = \tfrac12\|\cdot\|^2$. ■

We conclude this section with two results concerning D -Klee sets that are not assumed to be 
compact. When considering classical Klee sets, a standard assumption is closedness. The next result 
illustrates this assumption in the present Bregman distance setting. 

Proposition 3.4 Suppose that $\overline{C}$ is a compact subset of $U$, and that $U \subseteq \operatorname{dom} Q_C$. Then $\overline{C}$ is D-Klee if
and only if C is D-Klee and $Q_C$ is continuous on $U$.

Proof. "$\Rightarrow$": Since $\overline{C}$ is compact, Theorem 3.2 implies that $\overline{C}$ is a singleton, say $\overline{C} = \{y\}$. But then
$C = \{y\} = \overline{C}$ is also D-Klee, and $Q_C|_U \equiv \{y\}$ is clearly continuous on $U$.

"$\Leftarrow$": Proposition 2.10 implies that both $-\nabla f \circ Q_C|_U$ and $-\nabla f \circ Q_{\overline{C}}|_U$ are monotone from $U$
to $\mathbb{R}^J$. Furthermore, since $Q_C$ is continuous on $U$, so is $-\nabla f \circ Q_C|_U$. Thus, by Proposition 2.9,
$-\nabla f \circ Q_C|_U$ is maximal monotone from $U$ to $\mathbb{R}^J$. On the other hand, Proposition 2.12 implies that

(39) $\operatorname{gr}(-\nabla f \circ Q_C|_U) \subseteq \operatorname{gr}(-\nabla f \circ Q_{\overline{C}}|_U)$.

Altogether, $-\nabla f \circ Q_C|_U = -\nabla f \circ Q_{\overline{C}}|_U$, which yields $Q_C|_U = Q_{\overline{C}}|_U$. Since $Q_C|_U$ is single-valued,
so is $Q_{\overline{C}}|_U$. Therefore, $\overline{C}$ is D-Klee. ■

If the underlying Bregman distance D is strictly convex in the second variable, then we obtain 
the following result. 

Proposition 3.5 Suppose that $(\forall x \in U)$ $D(x,\cdot)$ is strictly convex on $U$. Then $\operatorname{conv} C$ is D-Klee if and
only if C is D-Klee.

Proof. This is an immediate consequence of Proposition 2.15. ■



4 Characterization of Chebyshev Centers 

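The proof of our second main result (Theorem 4.4) relies upon the next two results.

Proposition 4.1 Suppose that C is compact. Then $F_C$ is proper, lower semicontinuous, and convex, with
$\operatorname{dom} F_C = \operatorname{dom} f = \operatorname{dom} Q_C$. Furthermore, $F_C$ is strictly convex on $\operatorname{dom}\partial F_C = \operatorname{int}\operatorname{dom} f = U$.

Proof. We observed already (see (9) and (11)) that $F_C$ is convex and lower semicontinuous, and
that $\operatorname{dom} F_C = \operatorname{dom} f = \operatorname{dom} Q_C$. Hence $F_C$ is proper. Now set

(40) $g\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]\colon x \mapsto \max_{c\in C}\big(\langle c - x, \nabla f(c)\rangle - f(c)\big)$,

and note that $g$ is convex with $\operatorname{dom} g = \mathbb{R}^J = \operatorname{int}\operatorname{dom}\partial g$ (see [23, Theorem 23.4]). Furthermore,

(41) $F_C = f + g$.

By the subdifferential sum rule (see [23, Theorem 23.8]), we have $\partial F_C = \partial f + \partial g$ and hence
$\operatorname{dom}\partial F_C = \operatorname{dom}(\partial f) \cap \operatorname{dom}(\partial g) = \operatorname{dom}(\partial f) \cap \mathbb{R}^J = \operatorname{dom}\partial f$. On the other hand, since $f$ is
a Legendre function, it follows from [23, Theorem 26.1] that $\operatorname{dom}\partial f = \operatorname{int}\operatorname{dom} f$. Altogether,
$\operatorname{dom}\partial F_C = \operatorname{int}\operatorname{dom} f = U$. Using once more the assumption that $f$ is a Legendre function, we
have that $f$ is strictly convex on $\operatorname{int}\operatorname{dom} f = U$, and therefore so is $F_C = f + g$. ■

Recall that for a proper convex function $g\colon \mathbb{R}^J \to \left]-\infty,+\infty\right]$, the directional derivative of $g$ at
$x \in \operatorname{dom} g$ in the direction $h \in \mathbb{R}^J$ is defined by

(42) $g'(x;h) = \lim_{t\to 0^+} \dfrac{g(x+th) - g(x)}{t}$.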

Theorem 4.2 (directional derivative) Suppose that C is compact, let $x \in \operatorname{dom} f$, and let $h \in \mathbb{R}^J$. Then

(43) $F_C'(x;h) = \sup\{f'(x;h) - \langle h,\nabla f(y)\rangle \mid y \in Q_C(x)\}$.

If $x \notin U$ and $x + h \in U$, then $F_C'(x;h) = -\infty$.

Proof. Recall that $\operatorname{dom} F_C = \operatorname{dom} f = \operatorname{dom} Q_C$ by Proposition 4.1, so let $y \in Q_C(x)$. Then

(44) $(\forall t > 0)\quad F_C(x+th) \ge D(x+th,y) = f(x+th) - f(y) - \langle x + th - y, \nabla f(y)\rangle$

and

(45) $F_C(x) = D(x,y) = f(x) - f(y) - \langle x - y, \nabla f(y)\rangle$.

Hence, $(\forall t > 0)$ $F_C(x+th) - F_C(x) \ge f(x+th) - f(x) - \langle th,\nabla f(y)\rangle$. Dividing by $t$ and taking
the infimum over $t > 0$ yields

(46) $F_C'(x;h) \ge f'(x;h) - \langle h,\nabla f(y)\rangle$.

Supremizing over $y \in Q_C(x)$ yields

(47) $F_C'(x;h) \ge \sup\{f'(x;h) - \langle h,\nabla f(y)\rangle \mid y \in Q_C(x)\}$.

If $[x,x+h] \cap \operatorname{dom} f = \{x\}$, then $f'(x;h) = +\infty$; hence, (43) follows from (47). Thus, we assume
that $[x,x+h] \cap \operatorname{dom} f$ contains a nontrivial line segment. Let $(t_n)_{n\in\mathbb{N}}$ be a sequence in $\left]0,1\right[$ such
that $t_n \to 0^+$ and $(x + t_n h)_{n\in\mathbb{N}}$ lies in $\operatorname{dom} f$. Furthermore, for every $n \in \mathbb{N}$, let $c_n \in Q_C(x + t_n h)$.
After passing to a subsequence and relabeling if necessary, we also assume that $c_n \to c \in C$. Then,
for every $n \in \mathbb{N}$,

(48) $F_C(x + t_n h) = D(x + t_n h, c_n) = f(x + t_n h) - f(c_n) - \langle x + t_n h - c_n, \nabla f(c_n)\rangle$

and $F_C(x) \ge D(x,c_n) = f(x) - f(c_n) - \langle x - c_n, \nabla f(c_n)\rangle$; consequently,

(49) $\dfrac{F_C(x + t_n h) - F_C(x)}{t_n} \le \dfrac{f(x + t_n h) - f(x)}{t_n} - \langle h, \nabla f(c_n)\rangle$.

Letting $n \to +\infty$ in (49), we deduce that

(50) $F_C'(x;h) \le f'(x;h) - \langle h,\nabla f(c)\rangle$.

On the other hand, using line segment continuity of $f$ and $F_C$ at $x$ (see [23, Corollary 7.5.1]),
and continuity of both $f$ and $\nabla f$ on $U$, we see that letting $n \to +\infty$ in (48)
yields $F_C(x) = D(x,c)$. Hence $c \in Q_C(x)$. It thus follows from (50) that
$F_C'(x;h) \le \sup\{f'(x;h) - \langle h,\nabla f(y)\rangle \mid y \in Q_C(x)\}$. Combining this with (47), we deduce that (43) holds.

The "If" statement follows from (43) and [23, Theorem 23.3]. ■

Theorem 4.3 (subdifferential) Suppose that C is compact, and let $x \in U$. Then

(51) $\partial F_C(x) = \nabla f(x) - \operatorname{conv}\nabla f(Q_C(x))$.

Proof. By Theorem 4.2 and [23, Theorem 23.4], $F_C'(x;\cdot)$ is the support function of both
$\nabla f(x) - \nabla f(Q_C(x))$ and $\partial F_C(x)$. Therefore, the latter set (which is closed and convex already) is the
closed convex hull of the former set. Since $Q_C(x)$ is a compact subset of $U$ by Proposition 2.11,
it follows from the continuity of $\nabla f$ on $U$ and from [23, Theorem 17.2] that
$\overline{\operatorname{conv}}\,\nabla f(Q_C(x)) = \operatorname{conv}\nabla f(Q_C(x))$. This completes the proof. ■

Theorem 4.4 (uniqueness and characterization of the D-Chebyshev center) Suppose that C is
compact. Then $F_C$ has a unique minimizer $x \in \operatorname{dom} f$, called the D-Chebyshev center of C, and
characterized by

(52) $x \in \nabla f^*\big(\operatorname{conv}\nabla f(Q_C(x))\big)$.

Proof. Theorem 3.1 states that $\operatorname{argmin} F_C$ is a nonempty subset of $U$. In view of the strict convexity
of $F_C$ on $U$ (Proposition 4.1), $\operatorname{argmin} F_C$ is a singleton, say $\{x\}$. By Theorem 4.3,
$0 \in \partial F_C(x) = \nabla f(x) - \operatorname{conv}\nabla f(Q_C(x))$ and thus $\nabla f(x) \in \operatorname{conv}\nabla f(Q_C(x))$. Now apply Fact 2.4. ■

Corollary 4.5 (Garkavi-Klee) (See [15] and also [18].) Suppose that C is compact and that $x \in \mathbb{R}^J$.
Then $x$ is the Chebyshev center of C with respect to the Euclidean distance if and only if

(53) $x \in \operatorname{conv} Q_C(x)$.

Corollary 4.6 (Nock-Nielsen) (See [22] and also [21].) Suppose that C is finite. Then the
D-Chebyshev center of C is the unique point $x \in U$ characterized by

(54) $x \in \nabla f^*\big(\operatorname{conv}\nabla f(Q_C(x))\big)$.

Corollary 4.7 Suppose that C is compact and that it contains at least 2 points, and let $x \in U$ be the
D-Chebyshev center of C. Then $Q_C(x)$ must contain at least 2 points.

Proof. Suppose to the contrary that $Q_C(x)$ is a singleton. Then (52) implies that $Q_C(x) = \{x\}$, i.e.,
that $x$ is its own farthest point in C. In view of (5) and the assumption that C contains a point
different from $x$, this is absurd. ■



5 Constructing and Visualizing Chebyshev Centers 

We work in the Euclidean plane, i.e., we assume that $J = 2$, and we let D be the halved Euclidean
distance squared, the Kullback-Leibler divergence, or the Itakura-Saito distance (see Example 1.3).
Set

(55) $c_0 = (1,a)$ and $c_1 = (a,1)$, where $a \in \left]1,+\infty\right[$,

and

(56) $(\forall \lambda \in \mathbb{R})\quad c_\lambda = (1-\lambda)c_0 + \lambda c_1$.

Furthermore, we assume that

(57) $C = \operatorname{conv}\{c_0,c_1\} = \{c_\lambda \mid \lambda \in [0,1]\} = \{(1 - \lambda + \lambda a, (1-\lambda)a + \lambda) \mid \lambda \in [0,1]\}$
$\quad = \{((a-1)\lambda + 1, (1-a)\lambda + a) \mid \lambda \in [0,1]\}$.

Note that $C \subseteq U$, and that C is compact and convex. In view of Theorem 4.4, the D-Chebyshev
center $z$ of C is characterized by

(58) $z \in \nabla f^*\big(\operatorname{conv}\nabla f(Q_C(z))\big)$.

Our aim in this section is to determine $z$ and related objects, and to visualize them. It will be
convenient to set

(59) $\Delta = \{(x,x) \mid x \in \mathbb{R}\}$.

Proposition 5.1 $z \in \Delta$.

Proof. For $x = (x_1,x_2) \in \mathbb{R}^2$, set $x^{\intercal} = (x_2,x_1)$. Observe that for the choices of D considered in
this section, $(\forall x \in \mathbb{R}^2)(\forall y \in \mathbb{R}^2)$ $D(x,y) = D(x^{\intercal},y^{\intercal})$ and that $C^{\intercal} = \{c^{\intercal} \mid c \in C\} = C$. Thus,
$(\forall x \in \mathbb{R}^2)$ $F_C(x) = F_C(x^{\intercal})$. Since $z$ is the unique minimizer of $F_C$, we must have that $z = z^{\intercal}$, i.e.,
that $z \in \Delta$. ■



Example 5.2 (halved Euclidean distance squared) Suppose D is as in Example 1.3(i) and let
$x = (x_1,x_2) \in \mathbb{R}^2$. Then

(60) $Q_C(x) = \begin{cases}\{c_0\}, & \text{if } x_2 < x_1;\\ \{c_1\}, & \text{if } x_1 < x_2;\\ \{c_0,c_1\}, & \text{if } x_1 = x_2,\end{cases}$

and $z = c_{1/2} = \big(\tfrac12(1+a), \tfrac12(1+a)\big)$.

Proof. Set

(61) $d_x\colon \mathbb{R} \to \left[0,+\infty\right[\colon \lambda \mapsto D(x,c_\lambda)$.

Then for every $\lambda \in \mathbb{R}$, we have

(62) $d_x(\lambda) = (a-1)^2\lambda^2 + (1-a)(x_1 - x_2 + a - 1)\lambda + \tfrac12\big((x_1-1)^2 + (x_2-a)^2\big)$,

(63) $d_x'(\lambda) = (x_1 - x_2 - 1 + a)(1-a) + 2\lambda(1-a)^2$ and $d_x''(\lambda) = 2(a-1)^2$.

Hence $d_x$ is strictly convex and $Q_C(x) \subseteq \{c_0,c_1\}$. Since $d_x(0) - d_x(1) = (1-a)(x_2 - x_1)$, we obtain (60).
Furthermore, since C is convex and $c_{1/2} \in \Delta$, we have $c_{1/2} \in C = \operatorname{conv}\{c_0,c_1\} = \operatorname{conv} Q_C(c_{1/2})$. Therefore,
the characterization (58) of $z$ yields $z = c_{1/2}$. (Alternatively, one may verify that $c_{1/2}$ is the unique
minimizer of the function $\Delta \to \left[0,+\infty\right[\colon (x,x) \mapsto d_{(x,x)}(0) = F_C(x,x)$.) ■
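The claim that the midpoint of the segment is the Chebyshev center in the Euclidean case can be sanity-checked by brute force. The following sketch is an illustration only: the value $a = 4$, the grid of candidate centers, and the uniform discretization of C are arbitrary choices, and the helper names are hypothetical.

```python
def D(x, y):
    """Halved Euclidean distance squared."""
    return 0.5 * sum((xj - yj) ** 2 for xj, yj in zip(x, y))

def c(lam, a):
    """Point c_lambda on the segment from c_0 = (1, a) to c_1 = (a, 1)."""
    return ((a - 1.0) * lam + 1.0, (1.0 - a) * lam + a)

def F(x, a, samples=51):
    """Farthest distance from x to a uniform discretization of C."""
    return max(D(x, c(i / (samples - 1), a)) for i in range(samples))

a = 4.0
grid = [i / 10.0 for i in range(0, 51)]   # candidate coordinates 0.0, 0.1, ..., 5.0
# Smallest farthest distance over the grid, together with the point attaining it.
best = min((F((u, v), a), (u, v)) for u in grid for v in grid)
```

For $a = 4$ the grid minimizer is the midpoint $(2.5, 2.5)$ of $c_0$ and $c_1$, in line with $z = c_{1/2}$.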



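Example 5.3 (Kullback-Leibler divergence) Suppose D is as in Example 1.3(iii) and let
$x = (x_1,x_2) \in U$. Then

(64) $Q_C(x) = \begin{cases}\{c_0\}, & \text{if } x_2 < x_1;\\ \{c_1\}, & \text{if } x_1 < x_2;\\ \{c_0,c_1\}, & \text{if } x_1 = x_2,\end{cases}$

and $z = (\sqrt{a},\sqrt{a})$.

Proof. Set

(65) $d_x\colon \mathbb{R} \to [0,+\infty]\colon \lambda \mapsto D(x,c_\lambda)$.

Then $\operatorname{dom} d_x = \{\lambda \in \mathbb{R} \mid c_\lambda \in U\} = \left]-1/(a-1), a/(a-1)\right[ \supseteq [0,1]$. For every $\lambda \in \operatorname{dom} d_x$, we
have

(66) $d_x(\lambda) = -x_1\ln\Big(\dfrac{(a-1)\lambda + 1}{x_1}\Big) - x_1 + 1 - x_2\ln\Big(\dfrac{(1-a)\lambda + a}{x_2}\Big) - x_2 + a$,

(67) $d_x'(\lambda) = -\dfrac{x_1(a-1)}{(a-1)\lambda + 1} - \dfrac{x_2(1-a)}{(1-a)\lambda + a}$,

and

(68) $d_x''(\lambda) = \dfrac{x_1(a-1)^2}{((a-1)\lambda + 1)^2} + \dfrac{x_2(1-a)^2}{((1-a)\lambda + a)^2} > 0$.

Thus, $d_x$ has no local maximizers in $\operatorname{dom} d_x$ and therefore $Q_C(x) \subseteq \{c_0,c_1\}$. Because of

(69) $D(x,c_0) - D(x,c_1) = d_x(0) - d_x(1) = (x_1 - x_2)\ln(a)$,

we see that (64) must hold. Finally, (64) implies that

(70) $(\sqrt{a},\sqrt{a}) = \big(\exp(\tfrac12(0 + \ln(a))), \exp(\tfrac12(\ln(a) + 0))\big)$
$\quad = (\exp\times\exp)\big(\tfrac12(\ln(1),\ln(a)) + \tfrac12(\ln(a),\ln(1))\big)$
$\quad = \nabla f^*\big(\tfrac12\nabla f(c_0) + \tfrac12\nabla f(c_1)\big)$
$\quad \in \nabla f^*\big(\operatorname{conv}\nabla f(Q_C(\sqrt{a},\sqrt{a}))\big)$.

In view of the characterization (58) of $z$, we deduce that $z = (\sqrt{a},\sqrt{a})$. ■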
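The computation above can be replayed numerically. The sketch below is only a sanity check under stated assumptions: it fixes $a = 9$, uses the componentwise $\ln$ and $\exp$ as the gradients of the negative entropy and its conjugate, relies on the fact that the farthest points lie in $\{c_0, c_1\}$, and compares the candidate center with a few nearby points. The helper names and the step size 0.01 are hypothetical choices.

```python
import math

def kl(x, y):
    """Kullback-Leibler divergence for strictly positive x and y."""
    return sum(xj * math.log(xj / yj) - xj + yj for xj, yj in zip(x, y))

def dual_midpoint(c0, c1):
    """grad f* applied to the midpoint of grad f(c0) and grad f(c1)."""
    return tuple(math.exp(0.5 * (math.log(u) + math.log(v))) for u, v in zip(c0, c1))

a = 9.0
c0, c1 = (1.0, a), (a, 1.0)
z = dual_midpoint(c0, c1)        # close to (sqrt(a), sqrt(a)) = (3, 3)
gap = kl(z, c0) - kl(z, c1)      # close to 0: both endpoints are farthest from z

def F(x):
    """Farthest distance, using that the farthest points are endpoints of C."""
    return max(kl(x, c0), kl(x, c1))

steps = (-0.01, 0.0, 0.01)
neighbors = [(z[0] + dx, z[1] + dy) for dx in steps for dy in steps if (dx, dy) != (0.0, 0.0)]
is_local_min = all(F(z) < F(p) for p in neighbors)
```

Since $F_C$ is convex, a strict local minimum on this small stencil is consistent with $z$ being the global minimizer.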
Remark 5.4

(i) The fact that the extreme points $\{c_0,c_1\}$ play a role in Example 5.2 and Example 5.3 is not
surprising since in these cases $D(x,\cdot)$ is convex for every $x \in U$ (see, e.g., [3]) so that [23,
Corollary 32.3.2] applies.

(ii) Note that $z$ is the arithmetic mean of $c_0$ and $c_1$ when D is the halved Euclidean distance
squared (Example 5.2), and that $z$ is the geometric mean of $c_0$ and $c_1$ when D is the Kullback-Leibler
divergence (Example 5.3). This might nurture the conjecture that $z$ is the harmonic
mean of $c_0$ and $c_1$ for the Itakura-Saito distance. Depending on the location of $a$, this is
sometimes but not always the case (see Example 5.5 and Lemma 5.6).



Example 5.5 (Itakura-Saito distance) Suppose that D is as in Example 1.3(iv). Set

(71) $g = g(a) = \dfrac{a(a+1)\ln\big(\frac{(a+1)^2}{4a}\big)}{(a-1)^2}$ and $h = h(a) = \dfrac{2a}{a+1}$.

Then

(72) $z = \begin{cases}(h,h), & \text{if } g < h;\\ (g,g), & \text{if } g \ge h,\end{cases}$ and $Q_C(z) = \begin{cases}\{c_0,c_1\}, & \text{if } g < h;\\ \{c_0,c_{1/2},c_1\}, & \text{if } g \ge h.\end{cases}$

Proof. Set

(73) $\mathbf{g} = (g,g)$ and $\mathbf{h} = (h,h)$,

and note that a straightforward computation yields

(74) $\Delta \cap \nabla f^*\big(\operatorname{conv}\nabla f(\{c_0,c_1\})\big) = \{\mathbf{h}\}$.

Let $\mathbf{x} = (x,x) \in U \cap \Delta$ and set

(75) $d_x\colon \mathbb{R} \to [0,+\infty]\colon \lambda \mapsto D(\mathbf{x},c_\lambda)$.

Then $\operatorname{dom} d_x = \{\lambda \in \mathbb{R} \mid c_\lambda \in U\} = \left]-1/(a-1), a/(a-1)\right[ \supseteq [0,1]$. For every $\lambda \in \operatorname{dom} d_x$, we
have

(76) $d_x(\lambda) = \ln\Big(\dfrac{(a-1)\lambda + 1}{x}\Big) + \dfrac{x}{(a-1)\lambda + 1} - 1 + \ln\Big(\dfrac{(1-a)\lambda + a}{x}\Big) + \dfrac{x}{(1-a)\lambda + a} - 1$

and hence

(77) $d_x'(\lambda) = \dfrac{a-1}{(a-1)\lambda + 1} - \dfrac{x(a-1)}{((a-1)\lambda + 1)^2} + \dfrac{1-a}{(1-a)\lambda + a} - \dfrac{x(1-a)}{((1-a)\lambda + a)^2}$.

We note in passing that an elementary calculation results in

(78) $d_x(0) - d_x(\tfrac12) = \ln\Big(\dfrac{4a}{(a+1)^2}\Big) + \dfrac{(a-1)^2}{a(a+1)}\,x$.

Now observe that $d_x'(\tfrac12) = 0$ and that $d_x'(\lambda)$ in (77) is also a quotient of two polynomials (in
$\lambda$), where the numerator is a polynomial of degree 3 or less. Thus, $d_x'$ has at most two further
roots different from $\tfrac12$, which would have to be centered symmetrically around $\tfrac12$ because of the
symmetry of $d_x$ about $\tfrac12$. Furthermore, $d_x(\lambda) \to +\infty$ as $\lambda$ approaches either boundary point of
$\operatorname{dom} d_x$. Hence, critical points of $d_x$ that are different from $\tfrac12$ cannot be local maximizers. Therefore,
$Q_C(\mathbf{x}) \subseteq \{c_0,c_{1/2},c_1\}$. The symmetry of D and C yields that exactly one of the following holds:
$Q_C(\mathbf{x}) = \{c_{1/2}\}$, $Q_C(\mathbf{x}) = \{c_0,c_1\}$, or $Q_C(\mathbf{x}) = \{c_0,c_{1/2},c_1\}$. Combining this with (78), we obtain
the equivalences

(79) $Q_C(\mathbf{x}) = \{c_0,c_1\} \;\Leftrightarrow\; d_x(0) - d_x(\tfrac12) > 0 \;\Leftrightarrow\; x > g$.

Let us now turn to the D-Chebyshev center $\mathbf{z}$ of C. Since $\mathbf{z} \in \Delta$ (Proposition 5.1) and $Q_C(\mathbf{z})$
must contain at least 2 points (Corollary 4.7), we write $\mathbf{z} = (z,z)$ and we deduce that either
$Q_C(\mathbf{z}) = \{c_0,c_1\}$ or $Q_C(\mathbf{z}) = \{c_0,c_{1/2},c_1\}$. In turn, this means that exactly one of the following
two cases holds:

(Case 1) $Q_C(\mathbf{z}) = \{c_0,c_1\}$,

or

(Case 2) $Q_C(\mathbf{z}) = \{c_0,c_{1/2},c_1\}$.

If (Case 1) holds, then Proposition 5.1, (58), (74), and (79) yield that $\mathbf{z} = \mathbf{h}$ and that $z > g$. Thus,

(80) (Case 1) $\Rightarrow$ $z = h > g$.

Using (78), we obtain the implication

(81) (Case 2) $\Rightarrow$ $z = g$.

We now assume momentarily that $g < h$. Then, by (79), $Q_C(\mathbf{h}) = \{c_0,c_1\}$ and hence
$\mathbf{h} \in \nabla f^*\big(\operatorname{conv}\nabla f(Q_C(\mathbf{h}))\big)$ by (74). In view of the characterization (58) of $\mathbf{z}$, we obtain $\mathbf{z} = \mathbf{h}$ and hence
$z = h$. We thus have verified the first case of (72).

Finally, we assume that $g \ge h$. In view of (80), (Case 1) cannot hold. Thus, (Case 2) must hold
and (81) yields that $z = g$, i.e., that $\mathbf{z} = \mathbf{g}$. ■
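The case distinction in Example 5.5 can be cross-checked numerically along the diagonal. The sketch below is an illustration under explicit assumptions, not the authors' computation: the closed forms used for $g$ and $h$ are those consistent with the inequality analysed in the proof of Lemma 5.6, the set C is replaced by a uniform discretization that contains $c_0$, $c_{1/2}$, and $c_1$, and the minimizer is located by a plain ternary search, which is justified because $F_C$ is convex. All helper names are hypothetical.

```python
import math

def itakura_saito(x, y):
    return sum(math.log(yj / xj) + xj / yj - 1.0 for xj, yj in zip(x, y))

def g(a):
    L = math.log((a + 1.0) ** 2 / (4.0 * a))
    return a * (a + 1.0) * L / (a - 1.0) ** 2

def h(a):
    return 2.0 * a / (a + 1.0)            # harmonic mean of 1 and a

def F_diag(t, a, samples=201):
    """Farthest Itakura-Saito distance from (t, t) to a discretization of C."""
    pts = (((a - 1.0) * i / (samples - 1) + 1.0, (1.0 - a) * i / (samples - 1) + a)
           for i in range(samples))
    return max(itakura_saito((t, t), p) for p in pts)

def argmin_convex(fun, lo, hi, iters=100):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if fun(m1) < fun(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)
```

For $a = 4$ one finds $g < h$ and the search returns a value close to $h(4) = 1.6$, whereas for $a = 32$ one finds $g > h$ and the search returns a value close to $g(32)$, matching the two cases of (72).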



The formula for z given in Example l5.5l immediately raises the question on how g and h relate to 
each other, viewed as functions of a. In the following result, we provide an alternative description 
of the inequality g < h. 

Lemma 5.6 Let the functions g and h be defined on the interval / = ]1, +00 [ hy 
(82) ^W = ^^l-f^^l K-) = ^. 
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Then there exists a real number a & I such that 

{g{x) < h{x), ifx < a; 
g{x)=h{x), ifx = a; 
g{x) > h{x), ifx > a. 

In fact, a 17.63. 



Proof. Observe that

(84) $h(x) > g(x) \;\Leftrightarrow\; \dfrac{2(x-1)^2}{(x+1)^2} > \ln\dfrac{(x+1)^2}{4x} = 2\ln(x+1) - \ln(4x) \;\Leftrightarrow\; k(x) := \dfrac{2(x-1)^2}{(x+1)^2} - 2\ln(x+1) + \ln(4x) > 0.$

Since

(85) $k'(x) = \dfrac{8(x-1)}{(x+1)^3} - \dfrac{2}{x+1} + \dfrac{1}{x} = \dfrac{-(x-1)(x^2-6x+1)}{x(x+1)^3} = \dfrac{-(x-1)\big(x-(3-2\sqrt{2})\big)\big(x-(3+2\sqrt{2})\big)}{x(x+1)^3},$

we set $\xi = 3 + 2\sqrt{2} \approx 5.83$, and we deduce that $k$ is strictly increasing on $]1, \xi]$ and that $k$ is strictly decreasing on $[\xi, +\infty[$. On the other hand, $k(1) = 0$ and $\lim_{x\to+\infty} k(x) = -\infty$. Altogether, there must exist some number $\alpha > \xi$ such that $k > 0$ on $]1, \alpha[$, $k(\alpha) = 0$, and $k < 0$ on $]\alpha, +\infty[$. In view of (84), we obtain (83). Finally, the claimed approximation $\alpha \approx 17.63$ follows from Maple, Mathematica, or by simple bisection. ■
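The bisection mentioned at the end of the proof can be sketched in a few lines of Python. This is only an illustration and not the authors' computation: `k` is the function from (84), while the helper name `bisect_root`, the tolerance, and the bracket from $3 + 2\sqrt{2}$ to $100$ are assumptions chosen for the example.

```python
import math


def k(x: float) -> float:
    """k(x) from (84); k(x) > 0 exactly when h(x) > g(x)."""
    return (2.0 * (x - 1.0) ** 2 / (x + 1.0) ** 2
            - 2.0 * math.log(x + 1.0) + math.log(4.0 * x))


def bisect_root(func, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Plain bisection (hypothetical helper); needs func(lo) > 0 > func(hi)."""
    if not (func(lo) > 0.0 > func(hi)):
        raise ValueError("bracket does not enclose a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid   # root lies to the right of mid
        else:
            hi = mid   # root lies to the left of mid (or at mid)
    return 0.5 * (lo + hi)


# k is positive at its maximizer 3 + 2*sqrt(2) and negative for large x,
# so this interval brackets the unique root beyond the maximizer.
left_end = 3.0 + 2.0 * math.sqrt(2.0)
alpha = bisect_root(k, left_end, 100.0)
print(f"alpha is approximately {alpha:.4f}")
```

Because $k$ is strictly decreasing to the right of its maximizer, any bracket starting there and ending at a point where $k$ is negative isolates the same root.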

Remark 5.7 Consider again Example 5.5 and its notation. Define numbers $\mu_0$, $\mu_{1/2}$, $\mu_1$ according to the following two alternatives:

(86) $g < h$: $\quad \mu_0 = \mu_1 = \tfrac{1}{2}, \qquad \mu_{1/2} = 0,$

or

(87) $g > h$: $\quad \mu_0 = \mu_1 = \dfrac{(a-1)^2 - 2a\ln\frac{(a+1)^2}{4a}}{(a-1)^2\ln\frac{(a+1)^2}{4a}}, \qquad \mu_{1/2} = \dfrac{-2(a-1)^2 + (a+1)^2\ln\frac{(a+1)^2}{4a}}{(a-1)^2\ln\frac{(a+1)^2}{4a}}.$

One may verify that $\{\mu_0, \mu_{1/2}, \mu_1\} \subset [0,1]$, that $\mu_0 + \mu_{1/2} + \mu_1 = 1$, and that

(88) $z = \nabla f^*\big(\mu_0 \nabla f(c_0) + \mu_{1/2} \nabla f(c_{1/2}) + \mu_1 \nabla f(c_1)\big).$

Note that the existence of such convex coefficients is guaranteed by (58).
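The verification suggested in this remark can be sketched numerically. The following Python fragment is a hedged illustration under stated assumptions: it takes the Itakura-Saito Legendre function to be $f(x) = -\sum_j \ln x_j$ (so that $\nabla f(x) = -1/x$ and $\nabla f^*(y) = -1/y$ componentwise), the points $c_0 = (1,a)$, $c_{1/2} = ((1+a)/2, (1+a)/2)$, $c_1 = (a,1)$, and the closed forms $g(a) = \frac{a(a+1)}{(a-1)^2}\ln\frac{(a+1)^2}{4a}$ and $h(a) = \frac{2a}{a+1}$. All function names are hypothetical.

```python
import math


def g_and_h(a: float):
    """g(a) and h(a) for a > 1, using the closed forms assumed above."""
    log_term = math.log((a + 1.0) ** 2 / (4.0 * a))
    g = a * (a + 1.0) * log_term / (a - 1.0) ** 2
    h = 2.0 * a / (a + 1.0)
    return g, h


def mu_coefficients(a: float):
    """Coefficients (mu_0, mu_1/2, mu_1) following (86) and (87)."""
    g, h = g_and_h(a)
    if g < h:
        return 0.5, 0.0, 0.5
    log_term = math.log((a + 1.0) ** 2 / (4.0 * a))
    denom = (a - 1.0) ** 2 * log_term
    mu_end = ((a - 1.0) ** 2 - 2.0 * a * log_term) / denom
    mu_half = (-2.0 * (a - 1.0) ** 2 + (a + 1.0) ** 2 * log_term) / denom
    return mu_end, mu_half, mu_end


def center_via_88(a: float):
    """Evaluate the right-hand side of (88) for the assumed Itakura-Saito f."""
    weights = mu_coefficients(a)
    mid = (1.0 + a) / 2.0
    points = [(1.0, a), (mid, mid), (a, 1.0)]
    # Convex combination of gradients; grad f(c) = -1/c componentwise.
    combo = [sum(w * (-1.0 / c[j]) for w, c in zip(weights, points))
             for j in range(2)]
    # Map back with grad f*(y) = -1/y componentwise.
    return tuple(-1.0 / y for y in combo)


for a in (4.0, 32.0):
    print(a, mu_coefficients(a), center_via_88(a), g_and_h(a))
```

For a value of $a$ with $g < h$ the result should coincide with $(h,h)$, and for a value with $g > h$ with $(g,g)$, in line with Example 5.5.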

Remark 5.8 Figure 1 shows the set $C$, the Chebyshev center $z$ of $C$, and the corresponding sphere of radius $F_C(z)$ centered at $z$, for a variety of values of $a$ (fixed within each row) and for each of the three distances analyzed (fixed within each column). Specifically, shown are $a = 4$ and $a = 8$ over the region $R = [0,10] \times [0,10]$ (top two rows), and $a = 16$ and $a = 32$ over the region $R = [0,50] \times [0,50]$ (bottom two rows). Each is shown over a color-map indicating $F_C(x)$ for each $x \in R$, with the interpretation of the colors indicated in the accompanying color-legend. Note that the colors indicate distances from each point in the specified region to the farthest point in $C$, but are only relative comparisons within each graph; the same color in separate images does not indicate the same numerical magnitude, neither for a fixed distance $D$ nor for a fixed value of $a$. In addition, the color-maps for the halved Euclidean distance squared and the Kullback-Leibler divergence were calculated using $Q_C(x)$ in Examples 5.2 and 5.3, respectively. However, the color-map for the Itakura-Saito distance was calculated numerically by a discretization of $C$ due to the absence of a corresponding formula for $Q_C(x)$ in Example 5.5. We make the following observations directly from Figure 1.

(i) As predicted by our analysis, for the halved Euclidean distance squared, $z$ falls on the point $c_{1/2}$ for all values of $a$ (left column). The color-map corresponds to $\max\{D(x, c_0), D(x, c_1)\}$, with $D(x, c_0) = D(x, c_1)$ along $\Delta$ as per (60).

(ii) For the Itakura-Saito distance and for small $a$ (see $a = 4$ and $a = 8$), the endpoints $c_0$ and $c_1$ are the farthest points from the Chebyshev center $(h,h)$. When $a > \alpha$ (see Lemma 5.6), the farthest points from $(g,g)$ are $\{c_0, c_{1/2}, c_1\}$, and $D((g,g), c_{1/2}) < F_C(h,h)$, visually confirming that $(g,g)$ is now the Chebyshev center (see Figure 1 for $a = 32$).
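The discretization approach mentioned in this remark can be sketched as follows. This is a minimal illustration, not the code used for Figure 1. It assumes the Itakura-Saito distance $D(x,y) = \sum_j \big(x_j/y_j - \ln(x_j/y_j) - 1\big)$ and $F_C(x) = \max_{c \in C} D(x,c)$, and it searches only along the diagonal, where the Chebyshev center is known to lie. The grid sizes, the search interval $[0.5, 5]$, and all function names are hypothetical choices.

```python
import numpy as np


def itakura_saito(x, y):
    """Assumed Itakura-Saito distance D(x, y), summed over the last axis."""
    ratio = np.asarray(x, dtype=float) / np.asarray(y, dtype=float)
    return np.sum(ratio - np.log(ratio) - 1.0, axis=-1)


def discretize_segment(a, n=2001):
    """n equally spaced points on the segment from (1, a) to (a, 1).

    An odd n places the midpoint c_{1/2} on the grid.
    """
    t = np.linspace(0.0, 1.0, n)[:, None]
    return (1.0 - t) * np.array([1.0, a]) + t * np.array([a, 1.0])


def farthest_distance(x, segment):
    """Discretized F_C(x): largest distance from x to the sampled points."""
    return float(np.max(itakura_saito(x, segment)))


def diagonal_minimizer(a, lo=0.5, hi=5.0, steps=4501):
    """Grid search for the t minimizing F_C((t, t)) on [lo, hi]."""
    segment = discretize_segment(a)
    ts = np.linspace(lo, hi, steps)
    values = [farthest_distance(np.array([t, t]), segment) for t in ts]
    return float(ts[int(np.argmin(values))])


for a in (4.0, 32.0):
    print(a, diagonal_minimizer(a))
```

With these choices, the grid minimizer for $a = 4$ should sit near $h = 2a/(a+1)$, and for $a = 32$ near $g$, matching observation (ii) up to the grid resolution.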

Remark 5.9 Finally, let us fix $x = (1,1)$ and assume that $a = 6$. For the Itakura-Saito distance, the farthest point $Q_C(x)$ is $c_{1/2}$, which is actually the nearest point of $C$ to $x$ for both of the other distances. Indeed, Figure 2 shows the spheres for the Itakura-Saito distance for a variety of radii. The thickness of the line segments is plotted proportional to the distance from $x$. (In addition, note that the Itakura-Saito ball is convex for small $a$, a fact not apparent in Figure 1.)
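The contrast described in this remark can be checked numerically on a discretization of $C$. The sketch below is illustrative only and rests on assumed formulas for the three distances with $x$ in the first argument: the halved squared Euclidean distance $\tfrac12\|x-y\|^2$, the Kullback-Leibler divergence $\sum_j \big(x_j\ln(x_j/y_j) - x_j + y_j\big)$, and the Itakura-Saito distance $\sum_j \big(x_j/y_j - \ln(x_j/y_j) - 1\big)$. The grid size and variable names are hypothetical.

```python
import numpy as np

a = 6.0
x = np.array([1.0, 1.0])

# Sample the segment C from (1, a) to (a, 1); an odd count keeps c_{1/2} on the grid.
t = np.linspace(0.0, 1.0, 1001)[:, None]
C = (1.0 - t) * np.array([1.0, a]) + t * np.array([a, 1.0])
mid_index = len(C) // 2  # index of the midpoint c_{1/2}

# Distances from x to every sampled point of C (assumed formulas, see above).
euclid = 0.5 * np.sum((x - C) ** 2, axis=1)
kullback_leibler = np.sum(x * np.log(x / C) - x + C, axis=1)
itakura_saito = np.sum(x / C - np.log(x / C) - 1.0, axis=1)

print("nearest (Euclidean):       ", C[np.argmin(euclid)])
print("nearest (Kullback-Leibler):", C[np.argmin(kullback_leibler)])
print("farthest (Itakura-Saito):  ", C[np.argmax(itakura_saito)])
```

All three printed points should be the midpoint $c_{1/2} = (3.5, 3.5)$: it is the nearest point for the first two distances and the farthest point for the third.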
Figure 1: The set $C$, the Chebyshev center $z$ of $C$, and the sphere of radius $F_C(z)$ centered at $z$, for $C$ the line segment connecting $(1,a)$ and $(a,1)$, for $a = 4$ and $a = 8$ over the region $[0,10] \times [0,10]$, and for $a = 16$ and $a = 32$ over the region $[0,50] \times [0,50]$. Each is shown over a color-map for one of the three distances analyzed in Section 5 (columns, left to right: Euclidean, Kullback-Leibler, Itakura-Saito), with the interpretation of the colors indicated in the color-legend.
Figure 2: Spheres for the Itakura-Saito distance centered at $x = (1,1)$, for a variety of radii. Also shown are the line segments $C$ from $(1,a)$ to $(a,1)$ for $a = 2, 4, 6, 8$, with plot intensity proportional to the distance from $x$.